Quality Scheme Assessment in the Clustering Process

Authors

  • Maria Halkidi
  • Michalis Vazirgiannis
  • Yannis Batistakis
Abstract

Clustering is mostly an unsupervised procedure, and most clustering algorithms depend on assumptions and initial guesses in order to define the subgroups present in a data set. As a consequence, in most applications the final clusters require some sort of evaluation. The evaluation procedure has to tackle difficult problems, which can be qualitatively expressed as: i. the quality of clusters, ii. the degree to which a clustering scheme fits a specific data set, iii. the optimal number of clusters in a partitioning. In this paper we present a scheme for finding the optimal partitioning of a data set during the clustering process, regardless of the clustering algorithm used. More specifically, we present an approach for evaluating clustering schemes (partitions) so as to find the number of clusters that best fits a specific data set. A clustering algorithm produces different partitions for different values of its input parameters. The proposed approach selects the best clustering scheme (i.e., the scheme with the most compact and well-separated clusters) according to a quality index we define. We verified our approach using two popular clustering algorithms on synthetic and real data sets in order to evaluate its reliability. Moreover, we study the influence of different clustering parameters on the proposed quality index.
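The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula of the authors' quality index, so the `quality_index` function below is a hypothetical stand-in (mean point-to-centroid distance divided by the smallest centroid-to-centroid distance, lower is better), and the toy points and candidate partitions are invented for illustration. The candidate label lists play the role of partitions produced by any clustering algorithm under different input parameters.

```python
import math

def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def quality_index(points, labels):
    """Hypothetical compactness/separation index (NOT the paper's index).

    compactness = mean distance from each point to its cluster centroid
    separation  = smallest distance between any two cluster centroids
    Lower values mean more compact, better separated clusters.
    Requires at least two clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(members) for label, members in clusters.items()}

    compactness = sum(
        math.dist(point, centers[label]) for point, label in zip(points, labels)
    ) / len(points)

    ids = list(centers)
    separation = min(
        math.dist(centers[a], centers[b])
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    )
    return compactness / separation

# Toy data (assumed for illustration): three square blobs of four points each.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_b = [(10, 0), (10, 1), (11, 0), (11, 1)]
blob_c = [(0, 10), (0, 11), (1, 10), (1, 11)]
points = blob_a + blob_b + blob_c

# Candidate partitions, keyed by number of clusters. In practice these would
# come from running a clustering algorithm with different parameter values.
candidates = {
    2: [0] * 8 + [1] * 4,               # blobs a and b merged
    3: [0] * 4 + [1] * 4 + [2] * 4,     # one cluster per blob
    4: [0, 0, 1, 1] + [2] * 4 + [3] * 4,  # blob a split in two
}

scores = {k: quality_index(points, labels) for k, labels in candidates.items()}
best_k = min(scores, key=scores.get)
print(best_k)  # prints 3
```

Because the index only needs the points and a label per point, the same comparison works for partitions from any algorithm, which mirrors the algorithm-independent framing of the abstract.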

Related resources

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...

Water Quality Zoning of Rivers by the Technique of Fuzzy Clustering Analysis

Zoning the pollution of a river may be the first or even the most important step in water quality management. In order to resolve its pollution, fuzzy clustering analysis may be used whenever a composite classification of water quality incorporates multiple parameters. In such cases, the technique may be used as a complement or an alternative to comprehensive assessment. In fuzzy cluster...

Application of a Self-Organizing Map for Clustering the Groundwater Quality in Kerman Province and Assessment its Suitability for Drinking and Irrigation Purposes

Evaluation of groundwater hydrochemical characteristics is necessary for planning and water resources management in terms of quality. In the present study, a self-organizing map (SOM) clustering technique was used to recognize the homogeneous clusters of hydrochemical parameters in water resources (including wells, springs and qanats) of Kerman province; then, the quality classification of groun...

Regulation of Electrical Distribution Companies via Efficiency Assessments and Reward-Penalty Scheme

Improving the performance of electrical distribution companies, as natural monopoly entities in the electric industry, has always been one of the main concerns of regulators. In this paper, a new incentive regulatory scheme is proposed to improve the performance of electrical distribution companies. The proposed scheme utilizes several efficiency assessments and a 3-dimensional reward-penalty ...

Publication date: 2000